Kevin Wang and his Princeton team challenge conventional wisdom in reinforcement learning by scaling neural networks to 1000 layers. They train with a self-supervised objective that recasts RL as a classification problem, and they report performance gains that rely on architectural choices such as residual connections and layer normalization.
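
To make the architectural point concrete, here is a minimal sketch of how residual connections and layer normalization let a very deep stack stay numerically stable. It is not the team's actual architecture: the function names, the ReLU activation, the width of 64, and the weight scaling are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each row to zero mean and unit variance (no learned scale/shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def residual_block(x, w1, w2):
    # Pre-norm residual block (an assumed layout): x + W2 * relu(W1 * LN(x)).
    h = layer_norm(x)
    h = np.maximum(h @ w1, 0.0)
    return x + h @ w2

def deep_residual_mlp(x, depth=1000, seed=0):
    # Stack `depth` residual blocks with random weights (hypothetical helper).
    width = x.shape[-1]
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(width)
    for _ in range(depth):
        w1 = rng.normal(0.0, scale, (width, width))
        # Shrink the residual branch so 1000 additions do not blow up the output.
        w2 = rng.normal(0.0, scale / np.sqrt(depth), (width, width))
        x = residual_block(x, w1, w2)
    return x

if __name__ == "__main__":
    batch = np.random.default_rng(1).normal(size=(4, 64))
    out = deep_residual_mlp(batch)
    print(out.shape, bool(np.all(np.isfinite(out))))
```

The skip path `x + ...` carries the input through every block unchanged, while normalization keeps the input to each block at a fixed scale. That combination is what lets the signal survive 1000 layers in this toy setting.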